Higher-Order Intensional Type Analysis in Type-Erasure Semantics
Abstract
Higher-order intensional type analysis is a way of defining type-indexed operations, such as map, fold and zip, based on run-time type information. However, languages supporting this facility are naturally defined with a type-passing semantics, which suffers from a number of drawbacks. This paper describes how to recast higher-order intensional type analysis in a type-erasure semantics. The resulting language is simple and easy to implement: we present a prototype implementation of the necessary machinery as a small Haskell library.

Draft of July 2003

1 Polytypic programming

Some functions are naturally defined by the type structure of their arguments. For example, a polytypic pretty printer can format any data structure by using type information to decompose it into basic parts. Without such a mechanism, one must write separate pretty printers for all data types and constantly update them as data types evolve. Polytypic programming simplifies the maintenance of software by allowing functions to automatically adapt to changes in the representation of data. Other classic examples of polytypic operations include reductions, comparison functions and mapping functions.

The theory behind such operations has been developed in a variety of frameworks [1, 2, 5, 7, 11, 12, 13, 14, 21, 24, 26, 28]. While many of these frameworks generate polytypic operations at compile time (through a source-to-source translation determined by static type information), higher-order intensional type analysis [31] defines polytypic operations with run-time type information.

Run-time type analysis has two advantages over static forms of polytypism. First, run-time analysis may index polytypic operations by types that are not known at compile time, allowing the language to support separate compilation, dynamic loading and polymorphic recursion. Second, run-time analysis may index polytypic operations by universal and existential types.
Because these quantified types hide information, the semantics of the programming language must provide type information at run time to define most operations over these types.

Run-time type analysis is naturally defined by a type-passing semantics because types play an essential role in the execution of programs. However, there are several significant reasons to prefer a semantics where types are erased prior to execution:

• A type-passing semantics always constructs and passes type information to polymorphic functions. It cannot support abstract data types because the identity of any type may be determined at run time. Furthermore, parametricity theorems [20, 27] about polymorphic terms are not valid with this semantics.

• Because both terms and type constructors describe run-time behavior, type passing results in considerable complexity in the semantics of languages that precisely describe execution. For example, a language that makes memory allocation explicit [16, 17] uses a formal heap to model how data is stored; with run-time types it is necessary to add a second heap (and all the attendant machinery) for type data.

• Operators that implement type analysis in a type-erasure semantics are easier to incorporate with existing languages (such as Haskell and ML) that already have this form of semantics. Extending these languages with this form of type analysis does not require global changes to their implementations. In fact, for some languages it is possible to define type analysis operators with library routines written in that language. For example, Weirich [30] shows how to encode first-order run-time type analysis in Fω [9], and Cheney and Hinze [3] implement the same capabilities in the Haskell language [19].

In first-order intensional type analysis, types such as int and bool × string are the subject of analysis; an operator called typerec computes a catamorphism over the structure of run-time types.
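As a rough illustration of first-order analysis in a type-erasure setting, the following Haskell sketch represents types by terms and defines typerec as a fold over those terms. It assumes GHC's GADTs extension. The names Rep, Branches, typerec, typeName and pretty are chosen here for illustration and are not taken from the cited encodings [3, 30].

```haskell
{-# LANGUAGE GADTs #-}

-- Term-level representations of types: a value of type Rep a stands for
-- the type a at run time. (Hypothetical names, for illustration only.)
data Rep a where
  RInt    :: Rep Int
  RBool   :: Rep Bool
  RString :: Rep String
  RPair   :: Rep a -> Rep b -> Rep (a, b)

-- One branch per type form, playing the role of the typerec branches.
data Branches r = Branches
  { onInt    :: r
  , onBool   :: r
  , onString :: r
  , onPair   :: r -> r -> r
  }

-- A catamorphism over the structure of a type representation.
typerec :: Branches r -> Rep a -> r
typerec bs RInt        = onInt bs
typerec bs RBool       = onBool bs
typerec bs RString     = onString bs
typerec bs (RPair a b) = onPair bs (typerec bs a) (typerec bs b)

-- Example use of typerec: render the analysed type as text.
nameBranches :: Branches String
nameBranches = Branches
  { onInt    = "int"
  , onBool   = "bool"
  , onString = "string"
  , onPair   = \l r -> "(" ++ l ++ " * " ++ r ++ ")"
  }

typeName :: Rep a -> String
typeName = typerec nameBranches

-- A type-indexed operation whose result type depends on the analysed
-- type, written by direct recursion on the representation.
pretty :: Rep a -> a -> String
pretty RInt        n      = show n
pretty RBool       b      = if b then "true" else "false"
pretty RString     s      = show s
pretty (RPair a b) (x, y) = "(" ++ pretty a x ++ ", " ++ pretty b y ++ ")"

main :: IO ()
main = do
  putStrLn (typeName (RPair RBool RString))       -- prints (bool * string)
  putStrLn (pretty (RPair RInt RBool) (3, True))  -- prints (3, true)
```

In this sketch the analysis inspects a Rep value rather than a type, so a function that is handed no representation has nothing to analyse. This is one way to read the contrast with type passing drawn in the list above.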
The idea behind higher-order intensional analysis is that the structure of parameterized types (i.e. higher-order type constructors) is examined. In this framework, typerec acts like an environment-based interpreter of the type language during execution. Higher-order analysis can define more polytypic operations than first-order analysis. For example, a polytypic function that counts the number of values of type α in a parameterized data structure of type τ α must analyze the type constructor τ. Many of the most important examples of polytypic programming are only definable by higher-order analysis, including maps, zips, folds and reductions.

Language    Type analysis    Semantics
λR [6]      First-order      Type-erasure
LH [31]     Higher-order     Type-passing
LHR         Higher-order     Type-erasure

Figure 1: Language comparison

Crary, Weirich and Morrisett [6] (CWM) describe how to support first-order intensional type analysis in a language with a type-erasure semantics. In their language λR, typerec examines terms that represent types instead of analyzing types. In a type-erasure version of higher-order analysis, typerec should examine term representations of higher-order type constructors. However, while CWM define representations of higher-order type constructors in λR, these representations cannot be used for higher-order analysis. For technical reasons discussed in Section 3, we cannot define a term that operates over these type constructor representations in the same way as the type-passing typerec term operates over type constructors. These difficulties prohibit an easy definition of a type-erasure language that may define higher-order polytypic operations.

In this paper, we show how to reconcile higher-order analysis with type erasure. Our specific contributions include:

• A language, called LHR, that supports higher-order intensional type analysis in a type-erasure semantics. Surprisingly, in some respects LHR is a simpler calculus than the type-passing version of higher-order type analysis.
• A translation between the type-passing version of higher-order intensional type analysis and LHR, with a proof of correctness.

• A prototype implementation of LHR as a Haskell library that is simple, easy to use and specialized to the Haskell type system, allowing polytypic functions to operate over built-in Haskell datatypes.

The structure of this paper is as follows. Section 2 reviews higher-order intensional type analysis (formalized with the language LH) and Section 3 discusses the problems with defining a type-erasure version of this language. In Section 4 we present the type-erasure language called LHR. We describe the translation between LH and LHR in Section 5. Section 6 describes the prototype implementation of LHR as a Haskell library. In Section 7 we discuss extensions of this translation, and in Section 8 we present related work and conclude. Appendix A contains the proof of correctness of the translation.

2 LH: Higher-order analysis with type-passing

The LH language (Figure 2) is a lightweight characterization of higher-order intensional type analysis that captures the core ideas of the language of Weirich [31]. It is a call-by-name variant of the Girard-Reynolds polymorphic lambda calculus [10, 9, 20] plus the typerec term to define polytypic operations. The choice of call-by-value or call-by-name is not significant, and call-by-name slightly simplifies the presentation.

(kinds)               κ ::= ⋆ | κ1 → κ2
(operators)           ⊕ ::= int | → | ∀⋆
(type constructors)   τ ::= α | λα:κ.τ | τ1 τ2 | ⊕
(types)               σ ::= τ | int | σ1 → σ2 | ∀α:κ.σ
(terms)               e ::= i | x | λx:σ.e | e1 e2 | Λα:κ.e | e[τ]
                          | typerec [∆, η, ρ][τ]〈τ : κ〉 of θ
(typerec branches)    θ ::= ∅ | θ{⊕ ⇒ e}
(term environment)    η ::= ∅ | η{α ⇒ e}
(tycon environment)   ρ ::= ∅ | ρ{α ⇒ τ}
(tycon context)       ∆ ::= ∅ | ∆{α ⇒ κ}
(term context)        Γ ::= ∅ | Γ{x ⇒ σ}
(operator signature)  Σ ::= { int ⇒ ⋆, → ⇒ ⋆ → ⋆ → ⋆, ∀⋆ ⇒ (⋆ → ⋆) → ⋆ }

Figure 2: Syntax of LH
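To make the kind and type constructor levels of Figure 2 concrete, here is a minimal Haskell sketch of those two syntactic categories together with the operator signature Σ. The data type and function names are hypothetical. The kindOf function follows the standard kinding rules of the simply typed lambda calculus, which this excerpt does not state, so it should be read as an assumption. Types σ and terms e are omitted.

```haskell
-- Kinds: κ ::= ⋆ | κ1 → κ2
data Kind = Star | Kind :=> Kind
  deriving (Eq, Show)
infixr 5 :=>

-- Operators: ⊕ ::= int | → | ∀⋆
data Op = OInt | OArrow | OForall
  deriving (Eq, Show)

-- Type constructors: τ ::= α | λα:κ.τ | τ1 τ2 | ⊕
data Tycon
  = TVar String
  | TLam String Kind Tycon
  | TApp Tycon Tycon
  | TOp Op
  deriving (Eq, Show)

-- The operator signature Σ, written as a total function on operators.
sigma :: Op -> Kind
sigma OInt    = Star
sigma OArrow  = Star :=> Star :=> Star
sigma OForall = (Star :=> Star) :=> Star

-- Kind of a type constructor under a tycon context ∆, here an
-- association list. Assumed standard rules; unlike the finite maps of
-- the paper, a new binding simply shadows an older one.
kindOf :: [(String, Kind)] -> Tycon -> Maybe Kind
kindOf delta (TVar a)     = lookup a delta
kindOf delta (TLam a k t) = (k :=>) <$> kindOf ((a, k) : delta) t
kindOf delta (TApp t1 t2) = do
  k1 <- kindOf delta t1
  k2 <- kindOf delta t2
  case k1 of
    k :=> k' | k == k2 -> Just k'
    _                   -> Nothing
kindOf _     (TOp op)     = Just (sigma op)

-- The constant → applied to int and int, i.e. int → int.
intToInt :: Tycon
intToInt = TApp (TApp (TOp OArrow) (TOp OInt)) (TOp OInt)

-- The constant ∀⋆ applied to λα:⋆. α → α.
polyId :: Tycon
polyId =
  TApp (TOp OForall)
       (TLam "a" Star (TApp (TApp (TOp OArrow) (TVar "a")) (TVar "a")))

main :: IO ()
main = do
  print (kindOf [] intToInt)                      -- prints Just Star
  print (kindOf [] polyId)                        -- prints Just Star
  print (kindOf [] (TApp (TOp OInt) (TOp OInt)))  -- prints Nothing
```

Writing Σ as a function rather than a lookup table reflects that it is a fixed finite map: every operator has exactly one kind, so no failure case is needed.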
Also for simplicity, the formal language contains only integers, functions, and polymorphic terms, although we will include additional forms (such as products, sums, and term and type recursion, with their usual semantics) in the examples. The behavior of typerec on these new type forms is analogous to that for integers, functions and polymorphic types.

Types, σ, which describe terms, are separated from type constructors, τ, although we often call type constructors of base kind, ⋆, types. The operators, ⊕, are a set of constants of the type constructor language. These constants correspond to the various forms of types: for example, the constant → applied to τ1 and τ2 is equivalent to the function type τ1 → τ2, and ∀⋆ τ is equivalent to the type ∀α:⋆. τ α.

The signature, Σ, is a fixed finite map that describes the kinds of the operators. We use the notation Σ(⊕) to refer to the kind of the operator ⊕. The language includes several other finite maps, such as ρ, η, θ, etc. We write the empty map as ∅, add a new binding to ρ with ρ{α ⇒ τ} (defined only when α ∉ Dom(ρ)) and retrieve a binding with ρ(α) (defined only when α ∈ Dom(ρ)). We also use the notation ρ(τ) to substitute in τ for all variables bound in ρ. The notation for the other maps is analogous.

The term typerec [∆, η, ρ][τ]〈τ : κ〉 of θ defines polytypic operations. Essentially, it behaves like an interpreter of the type constructor language, translating the type constructor τ (of kind κ) to an element of the term language using the branches θ for the interpretation of operators and the environment η for the interpretation of type variables. The typerec term is the binding occurrence for the variables that might appear in τ at run time: those that have a definition in the environment η. The context ∆ describes the kinds of those variables. The finite map ρ defines a substitution for the variables when τ appears outside of the scope

Unlike other languages with intensional type analysis such as